The Effect of Algorithms on Copy Number Variant Detection

Authors

  • Debby W. Tsuang
  • Steven P. Millard
  • Benjamin Ely
  • Peter Chi
  • Kenneth Wang
  • Wendy H. Raskind
  • Sulgi Kim
  • Zoran Brkanac
  • Chang-En Yu
Abstract

BACKGROUND The detection of copy number variants (CNVs) and the results of CNV-disease association studies rely on how CNVs are defined, and because array-based technologies can only infer CNVs, CNV-calling algorithms can produce vastly different findings. Several authors have noted the large-scale variability between CNV-detection methods, as well as the substantial false positive and false negative rates associated with those methods. In this study, we use variations of four common algorithms for CNV detection (PennCNV, QuantiSNP, HMMSeg, and cnvPartition) and two definitions of overlap (any overlap and an overlap of at least 40% of the smaller CNV) to illustrate the effects of varying algorithms and definitions of overlap on CNV discovery.

METHODOLOGY AND PRINCIPAL FINDINGS We used a 56 K Illumina genotyping array enriched for CNV regions to generate hybridization intensities and allele frequencies for 48 Caucasian schizophrenia cases and 48 age-, ethnicity-, and gender-matched control subjects. No algorithm found a difference in CNV burden between the two groups. However, the total number of CNVs called ranged from 102 to 3,765 across algorithms. The mean CNV size ranged from 46 kb to 787 kb, and the average number of CNVs per subject ranged from 1 to 39. The number of novel CNVs not previously reported in normal subjects ranged from 0 to 212.

CONCLUSIONS AND SIGNIFICANCE Motivated by the availability of multiple publicly available genome-wide SNP arrays, investigators are conducting numerous analyses to identify putative additional CNVs in complex genetic disorders. However, the number of CNVs identified in array-based studies, and whether these CNVs are novel or valid, will depend on the algorithm(s) used. Thus, given the variety of methods used, there will be many false positives and false negatives. Both guidelines for the identification of CNVs inferred from high-density arrays and the establishment of a gold standard for validation of CNVs are needed.
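
The two overlap definitions named in the abstract (any overlap, and an overlap of at least 40% of the smaller CNV) can be illustrated with a minimal Python sketch. This is not the authors' code: the `CNV` record, the function names, and the half-open base-pair coordinate convention are all assumptions made for illustration.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    """Hypothetical CNV call; coordinates are half-open [start, end) in bp."""
    chrom: str
    start: int
    end: int


def overlap_bp(a: CNV, b: CNV) -> int:
    """Base pairs shared by two calls (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def any_overlap(a: CNV, b: CNV) -> bool:
    """Definition 1: the calls share at least one base pair."""
    return overlap_bp(a, b) > 0


def overlap_at_least(a: CNV, b: CNV, fraction: float = 0.40) -> bool:
    """Definition 2: shared length is >= `fraction` of the smaller call."""
    smaller = min(a.end - a.start, b.end - b.start)
    return smaller > 0 and overlap_bp(a, b) >= fraction * smaller


# Illustrative coordinates only; not data from the study.
a = CNV("chr1", 100_000, 200_000)   # 100 kb call
b = CNV("chr1", 180_000, 400_000)   # shares 20 kb with a (20% of a)
c = CNV("chr1", 150_000, 400_000)   # shares 50 kb with a (50% of a)

same_any_ab = any_overlap(a, b)
same_40_ab = overlap_at_least(a, b)
same_40_ac = overlap_at_least(a, c)
```

Under these assumptions, `a` and `b` count as the same CNV under the first definition but not the second, which shows how the choice of definition alone can change the number of calls treated as matching.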


Related resources

Performance evaluation of block-based copy-move image forgery detection algorithms

Copy-move forgery is a particular type of distortion where a part or portions of one image is/are copied to other parts of the same image. This type of manipulation is done to hide a particular part of the image or to copy one or more objects into the same image. There are several methods for detecting copy-move forgery, including block-based and key point-based methods. In this paper, a method...


Microduplication of Xp22.31 and MECP2 Pathogenic Variant in a Girl with Rett Syndrome: A Case Report

Rett syndrome (RS) is a neurodevelopmental infantile disease characterized by an early normal psychomotor development followed by a regression in the acquisition of normal developmental stages. In the majority of cases, it results from a sporadic mutation in the MECP2 gene, which is located on the X chromosome. However, this syndrome has also been associated with microdeletions, gene translocations...


Estimating the Number of Wideband Radio Sources

In this paper, a new approach for estimating the number of wideband sources is proposed, based on the RSS or ISM algorithms. Numerical results show that the proposed MDL-based and EIT-based algorithms have much better detection performance than the EGM and AIC cases for small differences between the incident angles of sources. In addition, for similar conditions, the RSS algorithm offers hi...


Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays

Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illu...


Assessment of mitochondrial DNA copy number in peripheral blood leukocyte of opiate abusers and healthy individuals

Background: Previous studies indicate that variation in the mitochondrial DNA (mtDNA) copy number in peripheral blood leukocytes is associated with increased susceptibility to diseases including cancer. Opiate abusers are at high risk for diseases. In this study, we measured the mtDNA copy number in peripheral blood leukocytes in a group of opiate abusers compared with that in healthy individuals. Met...



Journal title:

Volume 5  Issue 

Pages  -

Publication date 2010